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MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES FOR 
REVERSIBLE STOCHASTIC DIFFERENTIAL EQUATIONS* 

Markos Katsoulakis 1 , Yannis Pantazis 1 and Luc Rey-Bellet 1 



Abstract. For a Markov process the detailed balance condition is equivalent to the time-reversibility 
of the process. For stochastic differential equations (SDE's) time discretization numerical schemes 
usually destroy the property of time-reversibility. Despite an extensive literature on the numerical 
analysis for SDE's, their stability properties, strong and/or weak error estimates, large deviations and 
infinite-time estimates, no quantitative results are known on the lack of reversibility of the discrete-time 
approximation process. In this paper we provide such quantitative estimates by using the concept of 
entropy production rate, inspired by ideas from non-equilibrium statistical mechanics. The entropy 
production rate for a stochastic process is defined as the relative entropy (per unit time) of the path 
measure of the process with respect to the path measure of the time-reversed process. By construction 
the entropy production rate is nonnegative and it vanishes if and only if the process is reversible. 
Crucially, from a numerical point of view, the entropy production rate is an a posteriori quantity, hence 
\Q ■ it can be computed in the course of a simulation as the ergodic average of a certain functional of the 

I/") 1 process (the so-called Gallavotti- Cohen (GC) action functional). We compute the entropy production 

for various numerical schemes such as explicit Euler-Maruyama and explicit Milstein's for reversible 
SDEs with additive or multiplicative noise. Additionally, we analyze the entropy production for the 
' BBK integrator of the Langevin processes. We show that entropy production is an observable that 

distinguishes between different numerical schemes in terms of their discretization-induced irreversibility. 
Furthermore, our results show that the type of the noise critically affects the behavior of the entropy 
production rate. 
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Resume. Pour un processus de Markov la condition de balance detaillee est equivalente a la re- 
versibilite du processus par rapport au renversement du temps. Pour les equations differentielles 
stochastiques, les schemas de discretisation detruisent en general cette propriete de reversibilite. En 
depit d'une vaste litterature sur l'analyse numerique des equations differentielles stochastiques, leur 
propriete de stabilite, les erreurs fortes et/ou faibles, les proprietes de grandes deviations et a long 
temps, il n'y a pas eu jusqu'a maintenant de resultats quantitatifs sur l'irreversibilite introduite par 
1' approximation numerique. Dans cet article nous fournissons de telles estimations, en nous basant sur 
le taux de production d'entropie, inspires par des idees de mecanique statistique hors-equilibre. Le 
taux de production d'entropie est, par definition, l'entropie relative (par unite de temps) du proces- 
sus par rapport au processus renverse en temps. Par construction, le taux de production d'entropie 
est non-negatif et il est zero si et seulement si le procesus est reversible. Crucialement, d'un point 
de vue numerique, le taux de production d'entropie peut etre evalue directement comme la moyenne 
ergodique d'une certaine fonctionnelle du processus (la fonctionelle de Gallavotti- Cohen), sous des 
conditions d'ergodicite adequates. Nous calculons la production d'entropie pour le schema explicite 
d'Euler-Maruyama et le schema explicite de Milstein pour des equations differentielles stochastiques 
reversibles avec des bruit additifs ou multiplicatifs. Nos resultats demontrent que le type de bruit 
change le comportement la production d'entropie de maniere critique. Finalement nous analysons la 
production d'entropie pour le schema BBK pour l'equation de Langevin. 
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Introduction 

In molecular simulations arising in the simulation of systems in materials science, chemical engineering, 
evolutionary games, computational statistical mechanics, etc. the equilibrium statistics obtained from numerical 
simulations are of great importance 6.2 2128) . For instance, the free energy of the system or free energy differences 
as well dynamic transitions between metastable states are quantities which are sampled at the stationary regime. 
In addition, physical processes are often modeled at a microscopic level as interactions between particles which 
obey a system of stochastic differential equations (SDE's) [BJH2]. To perform equilibrium simulations for the 
sampling of desirable observables, the solution of the system of SDE's must possess a (unique) ergodic invariant 
measure. The uniqueness of the invariant measure follows from the ellipticity or hypoellipticity of the generator 
of the process together with irreducibility, which means that the process can reach at some positive time any 
open subset of the state space with positive probability [16,2(3]. Under such conditions the distribution process 
converges to the invariant measure (ergodicity) which has a smooth density and the process started in the 
invariant measure is stationary, i.e. the distribution of the paths of the processes, is invariant under time- 
shift. Many processes of physical origin, such as diffusion and adsoprtion/desoprtion of interacting particles, 
satisfy the condition of detailed balance (DB), or equivalently, reversibility, i.e., the distribution of the path 
of the processes are invariant under time-reversal. It is easy to see that reversibility implies stationarity but 
is a strictly stronger condition in general. The condition of detailed balance often arises from a gradient-like 
behavior of the dynamics or from Hamiltonian dynamics if the time-reversal include reversal of the velocities. 

However, the numerical simulation of SDE's necessitates the use of numerical discretization schemes. Dis- 
cretization procedures, except in very special cases, results in the destruction of the DB condition. This affects 
the approximation process in at least two ways. First, the invariant measure of the approximation process, if 
it exists at all, is not known explicitly and, second, the time reversibility of the process is lost. Several recent 
results concerns the existence of the invariant measure for the discrete-time approximation and associated error 
estimates [2l[51 [T^lH3] but, to the best of our knowledge, there is no quantitative assessment of the irreversibility 
of the approximation process. Of course there exist Metropolized numerical schemes such as MALA [21] and 
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variations thereof which do satisfy the DB condition but they are numerically more expensive, especially in 
high-dimensional systems, as they require an accept/reject step. Thus, a quantitative understanding of the lack 
of reversibility for simpler discretization schemes can provide new insights for selecting which schemes are closer 
to satisfying the DB condition. 

The implications of irreversibility are only partially understood, both from the physical and mathematical 
point of view. These issues have emerged as a main theme in non-equilibrium statistical mechanics and it 
is well-known that irreversibility introduces a stationary current (net flow) to the system 8,18,23 but it is 
unclear how this current affects the long-time properties (i.e., the dynamics and large deviations) of the process 
such as exit times, correlation times and phase transitions of metastable states. Reversibility is a natural 
and fundamental property of physical systems and thus, if numerical simulation results in the destruction of 
reversibility, one should carefully quantify the irreversibility of the approximation process and we do in this 
paper using the entropy production rate. The entropy production rate which is defined as the relative entropy 
(per unit time) between the path measure of the process and the path measure of the reversed process is widely 
used in statistical mechanics for the study of non-equilibrium steady states of irreversible systems [5,8, lT1ll3j . A 
fundamental result on the structure of non-equilibrium steady states is the Gallavotti- Cohen fluctuation theorem 
that describes the fluctuations (of large deviations type) of the entropy production [5)181111013] and this result can 
be viewed as a generalization of the Kubo-formula and Onsager relations far from equilibrium. For our purpose, 
it is important to note that the entropy production rate is zero when the process is reversible and positive 
otherwise making entropy production rate a sensible quantitative measure of irreversibility. Furthermore, if we 
assume ergodicity of the approximation process, the entropy production rate equals the time-average of the 
Gallavotti-Cohen (GC) action functional which is defined as the logarithm of the Radon-Nikodym derivative 
between the path measure of the process and the path measure of the reversed process. A key observation of 
this paper is that an important feature of GC action functional is that it is an a posteriori quantity, hence, it is 
easily computable during the simulation making the numerical computation of entropy production rate tractable. 
We show that entropy production is a computable observable that distinguishes between different numerical 
schemes in terms of their discretization-induced irreversibility and as such allows us to adjust the discretization 
in the course of the simulation. 

We use entropy production to assess the irreversibility of various numerical schemes for reversible continuous- 
time processes. A simple class of reversible processes, yet of great interest, is the overdamped Langevin process 
with gradient- type drift (6) [3 [12]. The discretization of the process is performed using the explicit Euler- 
Maruyama (EM) scheme and we distinguish between two different cases depending on the kind of the noise. 
In the case of additive noise, under the assumption of ergodicity of the approximation process [2H3HH11I5] we 
prove that the entropy production rate is of order 0(At 2 ) where At is the time step of the numerical scheme. 
In the case of multiplicative noise, the results are remarkably different. Indeed, under ergodicity assumption, 
the entropy production rate for the explicit EM scheme is proved to have a lower positive bound which is 
independent of At. Thus irreversibility is not reduced by adjusting At, as the approximation process converges 
to the continuous-time process. The different behavior of entropy production depending on the kind of noise is 
one of the prominent findings of this paper. As a further step in our study, we formulate and test numerically the 
explicit Milstein's scheme with multiplicative noise (it is the next higher-order numerical scheme). Simulation 
results on a wide range of different multiplicative noises show that the entropy production rate of Milstein's 
scheme decreases as time step decreases with order O(At). 

Finally, we compute both analytically and numerically the entropy production rate for a discretization scheme 
for Langevin systems which is another important and widely- used class of reversible models [51112] . The Langevin 
equation is time-reversible if addition to reversing time, one reverses the sign of the velocity of all particles. 
The noise is degenerate but the process is hypo-elliptic and under mild conditions the Langevin equation is 
ergodic [T5j[T9, 26 . Our discretization scheme is an explicit EM-Verlet (symplectic)-implicit EM scheme also 
known as BBK integrator [HIE]. We rigorously prove, under ergodicity assumption of the approximation 
process, that the entropy rate produced by the numerical scheme for the Langevin process with additive noise 
is of order O(At), hence, in terms of irreversibility it can be an acceptable integration scheme. 
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75 The paper is organized in four sections. In Section Q] we recall some basic facts about reversible processes 

76 and define rigorously the entropy production. Moreover we give the basic assumption necessary for our proofs, 

77 namely, the ergodicity of both continuous-time and discrete-time approximation process. In Section [5] we 

78 compute the entropy production rate for reversible overdamped Langevin processes. The section is split into 

79 two subsections for the additive and multiplicative noise. In Section [3] we compute the entropy production rate 
so for the reversible (up to momenta flip) Langevin process using the BBK integrator. Conclusions and future 

81 extensions of the current work are summarized in the fourth and final Section. 

82 1. Reversibility, Gallavotti-Cohen Action Functional, and Entropy 

83 Production 

84 Let us consider a d-dimensional system of SDE's written as 

dX t = a(X t )dt + b(X t )dB t (1) 

85 where X t £ M d is a diffusion Markov process, a : R d — > R d is the drift vector, b : W d ->• R dxm is the diffusion 

86 matrix, and Bt £ R m is a standard m-dimensional Brownian motion. We will always assume that a and b are 

87 sufficiently smooth and satisfy suitable growth conditions and/or dissipativity conditions at infinity to ensure 

88 the existence of global solutions. The generator of the diffusion process is defined by 



J 



89 for test functions / which are twice continuously differentiable and with bounded derivatives up to second 

90 order. We assume that the process X t has a (unique) invariant measure fi(dx), and that it satisfies the Detailed 

91 Balance (DB) condition, i.e., its generator is symmetric in the Hilbert space L 2 (p), i.e. 

< £f,9 >L2( M ) = < f,£g >L»(p) ( 3 ) 

92 for suitable test functions /, g as above. 

93 A Markov process X t is said to be reversible if for any n and sequence of times t\ < ■ ■ ■ < t n the finite 

94 dimensional distributions of (X tl , X tn ) and of (X tn , X tl ) are identical. More formally, let ^ denote 

95 the path measure of the process X t on the time-interval [0,t] with Xq ~ p. Let denote the time reversal, i.e. 

96 8 acts on a path {X s }o< s <t has 

{QX) S = AVs (4) 

97 Then reversibility is equivalent to Pj^ ^ = P'^ ^ o 0. Additionally, it is well-known that a stationarjQ process 

98 which satisfies the DB condition is reversible. 

99 The condition of reversibility can be also expressed in terms of relative entropy as follows. Recall that for 

100 two probability measure ttx, tt2 on some measurable space, the relative entropy of tti with respect to 1T2 is given 

101 by R(tti\tt2) = J diri log ^£ if it\ is absolutely continuous with respect to 112 and +00 otherwise. The relative 

102 entropy is nonnegative, R(TTi\it2) > and -R(7ri|7T2) = if and only if 7Ti = 7r 2 . The entropy production rate of 

103 a Markov process X t is defined by 



1 1 f d¥ p 

EP cont := lim -i*(P f0)t] |Pf 0it] o 9) = lim - J dPf , t] log — (5) 



If X t satisfies DB and Xq ~ [i then i?(Pp t j|Pp t , o Q) is identically for all t and the entropy production 
rate is 0. Note that if A^o ~ p ^ fi then -R(Pf t]\^[o t] ©) is a boundary term, in the sense that it is O(l) 



^Stationarity is equivalent to starting the process Xt from its invariant measure, i.e., Xq ~ /1. 
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106 and so the entropy rate vanishes in this case in the large time limit (under suitable ergodicity assumptions). 

107 Conversely when EP cont ^ the process is truly irreversible. The entropy production rate for Markov processes 

108 and stochastic differential equations is discussed in more detail in [TT1IT5] . 

109 Let us consider a numerical integration scheme for the SDE (fTJ) which is written in the general form 

x i+1 = F{x u At, AWi) i = 1,2, ... (6) 

no where Xi € M. d is a discrete-time continuous state-space Markov process, At is the time-step and AWi £ M. m , i = 

in 1,2,... arc i.i.d. Gaussian random variables with mean and variance Atl m . We assume that the Markov 

112 process Xi has transition probabilities which are absolutely continuous with respect to Lebesgue measure with 

113 everywhere positive densities Tl(xi,Xi + i) := TlF(x,At,AW)i x i+i\ x i) an d we also assume that Xi has a invariant 

114 measure which we denote p,(dx) and which is then unique and has a density with respect to Lebesgue. In 
us general the invariant measure for X t and Xi differ, u ^ \x and Xi does not satisfy a DB condition. Note also 

116 that the very existence of /Z is not guaranteed in general. Results on the existence of p, do exist however and 

117 typically require that the SDE is elliptic or hypoellitptic and that the state space of X t is compact or that a 
us global Lipschitz condition on the drift holds [2,3, 14, 15 . 

119 Proceeding as in the continuous case we introduce an entropy production rate for the Markov process X4. 

120 Let us assume that the process starts from some distribution p[x)dx, then the finite dimensional distribution 

121 on the time window [0, t] where t = nAt is given by 

P[o,t](dzo, • • • ,dx n ) = p(x )H(xo,xi) ■ ■ •n(iE„_i,a;„)cfa;o • • -dx n . (7) 

122 For the time reversed path O(a;o, • • • x n ) — (x n , ■ ■ ■ ,xq) we have then 

P[o,t] o@(dx ,...,dx n ) = p(x n )IL(x n ,x n -i) ■ ■ ■U(x 1 ,x )dx ■ ■ ■ dx n (8) 

123 and the Radon-Nikodym derivative takes the form 

^ -eMW(t))^\ (9) 



dP [0,t] ° P( X n 

124 where W(t) is the Gallavotti-Cohen (GC) action functional given by 



W(t) = Win, At) := £ log ■ ( 10 ) 

i=0 1L \ x t+li x i) 

125 Note that W(t) is an additive functional of the paths and thus if X{ is ergodic, by the ergodic theorem the 

126 following limit exists 

EP(At) = lim -W(t) = lim -^—W(n;At) P - a.s.. (11) 

t-*oo t n->oo nAt 

127 We call the quantity EP(At) the entropy production rate associated to the numerical scheme. Note that we 

128 have, almost surely, 

EP(At) = ± lim -J2 ^t hXi+1 \ = 7x7 / ^g^\lL(x,y)fi(x)dxdy (12) 
At n^oo n U{x i+1 ,Xi) At J ll(y,x) 

129 and for concrete numerical schemes we will compute fairly explicitly the entropy production in the next sec- 

130 tions. Since we are interested in the ergodic average we will systematically omit boundary terms which do not 

131 contribute to ergodic averages and we will use the notation 

W 1 (t)=W 2 {t) if lim -{Wi{t)-W 2 {t)) = Q. (13) 

i— >oo t 
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132 For example we have 



133 Note also that using and (|10|) . entropy production rate is tractable numerically and it can be easily 

134 calculated "on-the-fly" once the transition probability density function LT(-, •) is provided. 

135 In the following sections we investigate the behavior of the entropy production rate for different discretization 

136 schemes of various reversible processes in the stationary regime. However, before proceeding with our analysis, 

137 let us state formally the basic assumptions necessary for our results to apply. 

138 Assumption 1.1. We have 

139 • The drift a and the diffusion b in §\§ as well as the vector F in ^ are C°° and all their derivatives 

140 have at most polynomial growth at infinity. 

141 • The generator C is elliptic or hypo-elliptic, in particular the transition probabilities and the invariant 

142 measure (if it exists) are absolutely continuous with respect to Lebesgue with smooth densities. For the 

143 discretized scheme we assume that Xi has smooth transition probabilities. 

144 • Both the continuous-time process X t and discrete-time process Xi are ergodic with unique invariant 

145 measures \i and p,, respectively. Furthermore for sufficiently small At we have 

K[f]-M Ii [f}\=0(At) (15) 

146 for functions f which are C°° with at most polynomial growth at infinity. 

147 Notice that inequality (|15j) is an error estimate for the invariant measures of the processes Xt and Xi. The 

148 rate of convergence in terms of At depends on the particular numerical scheme |14[I25] . Ergodicity results for 

149 (numerical) SDEs can be found in [2 , 3 : 9 . 14, 15 , 21 , 25 27 . For instance, if both drift term a{x) and diffusion term 

150 b(x) have bounded derivatives of any order, the covariance matrix (bb T )(x) is elliptic for all x £ M. d and there is 

151 a compact set outside of which holds x T a{x) < — C\x\ 2 for all x £ M. d (Lyapunov exponent) then it was shown 

152 in |25] that the continuous-time process as well both Euler and Milstein numerical schemes are ergodic and 

153 error estimate (|15|) holds. Another less restrictive example where ergodicity properties were proved is for SDE 

154 systems with degenerate noise and particularly for Langevin processes 15,26 . Again, a Lyapunov functional is 

155 the key assumption in order to handle the stochastic process at the infinity. More recently, Mattingly et al. [T4] 

156 showed ergodicity for SDE-driven processes restricted on a torus as well their discretizations utilizing only the 

157 assumptions of ellipticity or hypoellipticity and the assumption of local Lipschitz continuity for both drift and 

158 diffusion terms. 

159 2. Entropy Production for the Overdamped Langevin Processes 

160 The overdamped Langevin process, X t £ R d , is the solution of the following system of SDE's 

dX t = -^{X t )VV{X t )dt + -VE(X t )dt + o-(X t )dFJ t (16) 

lei where V : R d -> K is a smooth potential function, a : R d — > R dxm j s the diffusion matrix, £ := aa T : R d -> R dxd 

162 is the covariance matrix and B t is a standard m-dimensional Brownian motion. We assume from now on that 

163 £(2;) is invertible for any x so that the process is elliptic. It is straightforward to show that the generator of 

164 the process X t satisfies the DB condition Q with invariant measure 

fx(dx) = — exp(— V(x))dx (17) 

165 where Z = J Rd exp(— V(x))dx is the normalization constant and thus if Xq ~ [i then the Markov process X t is 

166 reversible. 
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a?i+i = Xi - ^E(xi)VV(xi)At + lvT,(xi)At + a(x l )AW l (18) 



168 with AWi ~ N(0, Atl m ), i = 1,2,... are m-dimensional iid Gaussian random variables. The process Xi is a 

169 discrete-time Markov process with transition probability density given by 



U(xi,x i+1 ) = — l — exp (-^-(Axi + ^(x^S/Vix^At - ^VT,( Xi )At) T 
Z(Xi) \2/\t 2 2 

^- 1 (x l )(Ax l + ^(xi)VV(xi)At - ivS(a!i)A*) 



(19) 



no where Axi — Xi+\—Xi and Z(xi) = (27r)" l / 2 | det E^)) 1 / 2 is the normalization constant for the multidimensional 

171 Gaussian distribution. The following lemma provides the GC action functional for the explicit EM time- 

172 discretization scheme of the overdamped Langevin process. 



173 Lemma 2.1. Assume that det S(x) Vie £ K d . Then the GC action functional of the process Xi solving U8\ 

174 is 



71 — 1 ^71—1 



W{n- At)= ^J[VV{x i+l ) + VV{xi)] + \Y, ^[^-\x l+1 )VY,{x l+1 ) + E- 1 (a; i )VS(a; i ) 

i=o^ i=o (20) 



4=0 



175 where = means equality up to boundary terms, as defined in (|13[) . 
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Proof. The assumption for non-zero determinant is imposed so that the transition probabilities and hence the 
GC action functional are non-singular. The proof is then a straightforward computation using (fT9|) and (|10[) . 



n-1 



W(n; At) := ^ [log n^, x i+1 ) - logn(xi + i,Xi)] = ^ [logZ(x i+ i) - logZ(a 



— y 

2At ^ 

i=0 



i=0 i=0 

n-1 r 



{Ax, + -X(xi)VV(xi)At - \fjY,(x l )At) T Y,- x (x l ){Ax l + ^E( Xi )VV{xi)At - -VE( Xi )At) 



-{-Ax, + ^E(x i+1 )VV{x i+1 )At - ivE(x l+1 )At) T S- 1 (a; l+1 )(-Aa; l + ^E(x i+1 )VV(x i+1 )At - ^VE(x i+1 )At) 

n-1 

= _ 2Ai 5- MS"^)^ + j WfofEfcOWfoJAt 2 + -VE(x l ) T E" 1 (x l )VI](x l )Ai 2 
+ Aacf W(x 4 )At - Aa^E^a^VE^At - -W^^VE^Ai 2 

- AxfZ-^Xi+^Axi - - A VV{x t+1 ) T ^{x l+1 )VV{x t+1 )At 2 - JvE^fE-^+OVE^+OAr 2 
+AxJVV(x l+1 )At - Axj E^ 1 (a; 4+1 )VI](a; 4+1 )At + i W(a; l+1 ) T V£(x l+1 )Af/ 

71—1 - U — 1 



= - ^ E A ^ - E'Wi)] Aari - i ^ Azf [VK(ivn) + VV( Xi )] 

i=0 2=0 

^ n—1 



i=0 

176 where all the terms of the general form G{xi) — G{xi + \) in the sums were cancelled out since they form telescopic 

177 sums which become boundary terms. □ 

178 Three important remarks can readily be made from the above computation. 

179 Remark 2.2. The numerical computation of entropy production rate as the time-average of the GC action 
wo functional on the path space (i.e., based on (|9])) at first sight seems computationally intractable due to the large 

181 dimension of the path space. However, due to ergodicity, the numerical computation of the entropy production 

182 can be performed as a time-average based on (jlip and (|20p for large n. Additionally, this computation can 

183 be done for free and "on-the-fly" since the quantities involved are already computed in the simulation of the 

184 process. The numerical entropy production rate shown in the following figures is computed using this approach. 

185 Remark 2.3. It was shown in 13J that the GC action functional of the continuous-time process driven by (116[) 

186 equals the Stratonovich integral 

W cont (t) = - f VV(X S ) o dX s = V(x ) - V{x t ) (21) 
Jo 

187 which reduces to a boundary term as expected. This functional has the discretization 

n-1 

2 



^ n—1 

W cont (t) Azf [W(a*+i) + VV(xi)] (22) 



i=0 



188 and this is exactly the first term in the GC action functional W(n; At) for the explicit EM approximation 

189 process (see ([20]) V However, the discretization scheme introduces two additional terms to the GC action 
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190 functional which may greatly affect the asymptotic behavior of entropy production as At goes to zero, as we 

191 demonstrate in Section 12.21 Notice that when the noise is additive, i.e., when the diffusion matrix is constant, 

192 then these two additional terms vanish and taking the limit At — > 0, the GC action functional W(n;At), if 

193 exists, becomes the Stratonovich integral W cont (t) which is a boundary term. 

194 Remark 2.4. The GC action functional W(n; At) consists of three terms (see (|2T))) ). each of which stems from 

195 a particular term in the SDE. Thus, each term in the SDE contributes to the entropy production functional 

196 a component which is totally decoupled to the other terms. The reason for this decomposition lies in the 

197 particular form of the transition probabilities for the explicit EM scheme which are exponentials with quadratic 

198 argument. This feature can be exploited for the study of entropy production of numerical schemes for processes 

199 with irreversible dynamics. Indeed, if a non-gradient term of the form a(X t )dt is added to the drift of (fTB")) . the 

200 process is irreversible and its GC action functional is not anymore a boundary term and is given by |13] 

rt i n-i 

W cont (t)= - / S- 1 (X t )a(X i ) o dX t « - J2 AxJlZ-^XiWxt) + S" 1 {x i+1 )a(x i+1 )} (23) 

i=0 

201 On the other hand, due to the separation property of the explicit EM scheme, the GC action functional of the 

202 discrete-time approximation process W(n; At) has the additional term 

n— 1 

- &xT\E- l (xi)a(xi) + E-^+iMzi+i)]. (24) 

203 Evidently, the discretization of W con t(t) equals the additional term of the GC functional W(n; At). Thus, GC 

204 action functional W(n; At) is decomposed into two components, one stemming from the irreversibility of the 

205 continuous-time process and another one stemming from the irreversibility of the discretization procedure. 



206 2.1. Entropy Production for the Additive Noise 



207 An important special case of (|16|) is the case of additive noise, i.e., when the covariance matrix does not 



208 



depend in the process, E(x) = S. In this case, the SDE system becomes 



dX t = ~EVV(X t )dt + adB t (25) 



X ~ [i 

209 and the GC action functional is simply given by 



n— 1 

W(n; At)= Axf[VV(x i+1 ) + VV( Xi )} (26) 

210 In this section we prove an upper bound for the entropy production of the explicit EM scheme. The proof 

211 uses several lemmas stated and proved in Appendix [A] 

212 Theorem 2.5. Let Assumption hold. Assume also that the potential function V has bounded fifth- order 

213 derivative and that the covariance matrix E is invertible. Then, for sufficiently small At, there exists C = 

214 C(V, £) > such that 

EP(At) < CAt 2 (27) 



10 TITLE WILL BE SET BY THE PUBLISHER 

215 Proof. Utilizing the generalized trapezoidal rule ([751) for k = 3, the GC action function is rewritten as 

n — l 

W(n; At)= Axf[VV(x i+1 ) + VV( Xi )} 

n-l f 

{ -(V(x l+1 ) - V( Xi )) + J2 C a [D a V(x l+l ) + D a V{ Xl )]^ 

' [ M=3 



1=0 



|a|=l,3,5 \P\=5-\a\ 
n-l 

= J2 E C a [D a V(x i+1 ) + D a V(xi)]Axf 

i=0 |a|=3 
n-l 

+ E E E B [Ri(x i>Xi+1 )+R^x i+1>Xi )]Ax^ . 

i=0 a| = l,3,5 \/3\=5-\a\ 

216 Applying, once again, Taylor series expansion to D a V (xi+i) , the GC action functional becomes 



W(n;At)=^2 \ E 2C a D a V( Xi )Axf + £ C Q £ ^ Q+ ^(^)A 

i=0 [|a|=3 |a|=3 |/3| = 1 

n-l 

+ E E E ^(^,^+i)Ax^ 

i=0 |a| = l,3,5 |/3|=5-|cc| 

217 where ^(a^a^+i) = B j3 [R^(x l ,x i+ i) + R l ^(x i+ i,x l )] + t\ a \ =3 R^(xi,x i+ i). Moreover, expanding Aa 

218 the multi-binomial formula 

Axf = (-^yV( Xi )At + aAWi) a = £ ( a )(-^V(x i )At) l '(aAW i ) a - / . 



219 Then, the GC action functional becomes 



n-l , v 



i=0 \a\=3v<a 

+ EE E E ^("^^^f^oc-^vf^)^)^^)^- 1 ' 

| a |__3 |/9|=1 ^<a+/3 V / 
n—l ✓ ,. 

+ E E E E r^j^^.^of-iEw^jAtr^AWi)^-''- 

i=0 |a|=l,3,5 \p\=5-\a\ is<a+/3 ^ ' 
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220 From (fTTj) . the entropy production rate is the time-averaged GC action functional as n — > oo. Thus, 
EP(At) = hm W ^ 

n->oo nAt 

n-1 



= ir E E c 4 j lim -E^^)(-^ w ^) A ^) ,y ( ffA ^) a " ,/ 

Ar ' — ' * — ' V v I n->oo n ^ — ' 2 
|a|=3i/<a v 7 i=0 

1 / 4- ft\ 1 " _1 1 (^) 

+ 7wE E E C 4 lim - Vz?«+^(x 4 )(--SV^)At)^aA^) Q+ ^ 

|o,|=3 101=1 w<a+/8 v 7 i=0 

4 E E E ( a+p ) lim -E^^'^)(-^ w ^)Air( ( xA^r + ^. 

Ar * — ' * — ' * — ' V v I n->oo n *■ — ' 2 

|a|=l,3,5 |0|=5-|a| v<a+P v 7 i=0 

221 The ergodicity of cc^ as well the Gaussianity of A Wj guarantees that the first two limits in the entropy production 

222 formula exist. Additionally, the residual terms, Xj+i), are bounded due to the assumption on bounded 

223 fifth-order derivative of V, hence, the third limit also exists. Note here that this assumption could be changed by 

224 assuming boundedness of a higher order derivative and performing a higher-order Taylor expansion. Appendix IA"1 

225 gives rigorous proofs of these ergodicity statements. Hence, 

EP(At) = A £ ^C a ^E |i [£>-V( a; )(-isVV( 3! )A*)'']E p [( 01 /) '-^ 

|ck|=3i><ck ^ 7 

+ 7A7E E E C a ( a yy4D a ^V(x)(~^VV(x)Atr}E p [(ayr + ^} 

|a|=3 \p\=lv<cc+P ^ 7 

+ A7 E E E ^yV^xp^^^C-isvy^A^^E.Ka?/)^ 



(33) 



At ^ ^ ^ \ v r"*"'-»^"^ 2 

|o|=l,3,5 |(8|=5-|a| i/<a+/3 x 7 

226 where /I is the equilibrium measure for xi while p is the Gaussian measure of AWj. Using the Isserlis-Wick 

227 formula we can compute the higher moments of multivariate Gaussian random variable from the second-order 

228 moments. Indeed, we have 

E^=E|^...^]=E[^...z M ] = { En J M I H ^ (34) 

229 where Y] ]~[ means summing over all distinct ways of partitioning zi,...,z\ v \ into pairs. Moreover, E[z<2j] = 

230 EyAt, hence, applying (1341) into (1331) and changing the multi- index notation to the usual notation, the entropy 

231 production rate becomes 

2 d d d ( 8 3 v ^ 

EP(At) ^EEE <W {e.[ (-^VV) fcl]St3t ,A^ 

+E P [ „ fl a (--EW) fc2 ]E fclfc3 Ar 2 + E p [ (--SVy) fe3 ]E fclfc2 At 2 + 0(Ar 3 ) 

p dx kl ox k2 dx k3 2 p dx kl ox k2 dx k3 2 

i d d d d r -1 
+ At E E E E Ck lk2 k 3 s ][ s feife2 S fe 3 fe 4 + Efc 1 fc 3 Sfe 2 fe 4 + E fclfc4 E fc2fc3 ]At 2 + 0(At 3 ) > 

kl — 1 k 2 — 1 ^3 — 1 /C4 — 1 L ^1 ^4 J 

+ >(A* 3 )- 

(35) 
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232 Using that (— |EW)fc, = -^t 4 =i ^i^ci^ - ' entropy production is rewritten as 

d 3 V dV , , d A V 



dx kl dxk 3 dx ki dx k2 M dx kl ...dx 



a a a a s 

EP(At) = 2J X! X! X! C k 1 k 2 k 3 \ Sfe 1 fe 2 Sfe 3 fe 4 
fc 1= i fc 2 =i fc 3 =i fc 4 =i ^ 

+ S felfe3 S fe2 fe 4 ^ E m [ ^ a . fci 6»a; fc2 9a;fe 4 da;k 3 + Ep dx kl ...dx ki 



&4 



(36) 



233 By a simple integration by parts, we observe that for any combination ki, — 1, <i 

a 3 y ay <9 4 v 

Eai dx kl dx k2 dx k3 dx k J E ^dx kl ...dx k J 

234 where the expectation is taken with respect of /i which is the invariant measure of the continuous-time process. 

235 However, in (I36[) the expectation is w.r.t. the invariant measure of the discrete-time process (i.e., \x instead of 

236 fi). Nevertheless, Assumption [TTT] guarantees that the alternation of the measure from \x to /2 costs an error of 

237 order O(At). Hence, for any coefficient in (|3"6"]l . we obtain that 



d 3 V dV , , d A V 

E H = = - En 



< 2KAt (38) 



^ dx kl dx k2 dx k3 dx ki M dx kl ...dx ki 

238 since the potential V as well its derivatives are sufficiently smooth. Hence, we overall showed that 

EP(At) = 0(At 2 ) (39) 

239 which completes the proof. □ 

240 Remark 2.6. Depending on the potential function the entropy production could be even smaller. For instance, 

241 when the potential V is a quadratic function (i.e. the continuous-time process is an Ornstein-Uhlenbeck process), 

242 then, it is easily checked by a trivial calculation of (1261) that the GC action function is a boundary term, thus, 

243 the entropy production of the explicit EM scheme is zero. However, for a generic potential V we expect that 

244 the entropy production rate decays quadratically as a function of At but not faster. 

245 2.1.1. Fourth- order potential on a torus 

246 Lets now proceed with an important example where the potential is a forth-order polynomial while the 

247 process takes values on a torus. Assume d — 2 while potential V = Vp is given by 



v ^ x )=py-f- L ir) ( 4 °) 

248 where /3 is a positive real number which in statistical mechanics has the meaning of the inverse temperature. 

249 The diffusion matrix is set to a = y / 2(3~ 1 Id- Based on [TS], Assumption 1 1 . 1 1 is satisfied because the domain is 

250 restricted to a torus, the potential is locally Lipschitz continuous and the covariance matrix is elliptic. Figure [I] 

251 presents both the GC action functional (upper panel) and the entropy production rate (lower panel) as a 

252 function of time for fixed At = 0.05. Both quantities are numerically computed while the inverse temperature 

253 is set to /3 = 10. Even though the variance of the GC action functional is large, entropy production which 

254 is the cumulative sum of the GC functional converges due to the law of large numbers to a (positive) value 

255 after relatively long time. Additionally, due to the ergodicity assumption, it converges to the correct value. 

256 Figure [2] shows the loglog plot of the numerical entropy production rate as a function of At for /3 = 20, 40, 60. 
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Figure 1. Upper panel: The GC action functional as a function of time for fixed At = 0.05. Its 
variance is large necessitating the use of many samples in order to obtain statistically confident 
quantities. Lower Panel: The entropy production rate as a function of time for the same At. 
It converges to a positive value as expected. 



257 Final time was set to t = 2 • 10 6 while initial point was set to one of the attraction points of the deterministic 

258 counterpart. For reader's convenience, the thick black line denotes the 0(At 2 ) rate of convergence. This plot is 

259 in agreement with the theorem's estimate (|27|) at least for small At while for larger time-steps (i.e. At > 0.1) 

260 the rate of entropy production is of order 0(At 3 ). Notice also that, for small At, entropy production rate is 

261 very close to and even larger final time is needed in order to obtain a statistically confident numerical estimate 

262 for the entropy production. Moreover, as it is evident from the figure and the GC action functional in ()26[) . 

263 the dependence of the entropy production w.r.t. the inverse temperature is inverse proportional. Thus, from a 

264 statistical mechanics point of view, the larger is the temperature the larger -in a linear manner- is the entropy 

265 production rate of the numerical scheme. 



266 2.2. Entropy Production for the Multiplicative Noise in Id 

267 For the multiplicative overdamped Langevin process, we restrict to the 1-dimensional case. The reason for this 

268 restriction is that we apply not only the EM scheme but also a higher-order scheme (Milstein's) which becomes 

269 complicated for general diffusion matrices in higher dimensions. Nonetheless, the results and conclusions of this 

270 subsection for both explicit EM and Milstein's schemes are valid in a more general, multi-dimensional setting 

271 where the diffusion matrix a{x) is diagonal. 

272 In order to study the entropy production rate of the explicit EM scheme for the overdamped Langevin process 

273 with multiplicative noise, the remainder terms of the GC action functional should be studied. In this direction 

274 we can rewrite the GC action function as it is given by the Lemma 12.11 for lei 

n — 1 -,n—l 

/ , L - .-,-r,. . ■ ... (/J — , / ^ L T,~ 1 (x i+ i)'E'(xi + i) + S~ 1 (xj)S / (a;i)]Aa;i 



W{n- At) = - ~ J2[V'(xi+i) + V'ixifiAxi + 1 ^p- 1 ^)^) + ^ {x^' ( Xl )]A 

i=0 i=Q 
1 "- 1 

+ ^ ]T [E^fo+i) - E-^a*)] Ax} =: W 1 (n; At) + W 2 (n; At) + W 3 (n; At) . 



i=0 
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Figure 2. Entropy production rate as a function of time step At for additive noise. The 
entropy production rate is of order 0(At 2 ) for small At while it decreases linearly as a function 
of inverse temperature /3. 



275 The entropy produced from Wi(n; At) was computed in the previous section and after an interesting and rather 

276 unexpected cancellation it was proved to be of order 0(At 2 ). For the multiplicative case, a cancellation also 

277 occurs (see (|45f and (|46|) below) but it does not fully eliminate the lower order term. In any case, Wi(n;At) 

278 contributes to the entropy production O(At). Additionally, W2(n; At) is also the sum of a gradient term since 

279 variance £(x) £ M and holds = (log Hence, assuming suitable condition on E(x), the 

280 same computation as for Wi(n; At) applies and the entropy production rate stemming from W2(n; At) is also of 

281 order O(At). However, W%{n; At) contributes to the entropy production with a positive term which is of order 

282 O(l). The following theorem summarizes the behavior of entropy production rate for the explicit EM scheme 

283 for multiplicative noise. 

284 Theorem 2.7. Let Assumption hold. Assume also that the potential function V has bounded fifth- order 

285 derivative while there exists M > such that T,(x) > M , Vx. 

286 (a) Let c — |E Al [(S _1 )(x)(E') 2 (x)] 7 then, for sufficiently small At, there exists C — C(V, S) > independent 

287 of At such that 

\EP(At) - c| < CAt (42) 

288 (b) Assuming that E M [(E _1 )(x)(E') 2 (x)] ^ 0, then, for sufficiently small At, there exists a lower bound c' — 

289 c'(V,Ti) > independent of At such that 

c < EP(At) (43) 



290 



Proof. Assumption S(x) > M^ 1 Vx, which is the ellipticity condition applied in Id, is necessary because it 
makes E _1 (x) as well its derivatives bounded around 0. Additionally, both W\{n; At) and Wiiji; At) contribute 
to the entropy production by a O(At) amount which does not affect the proof of the theorem hence they are 
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eliminated. Thus, concentrating to W 3 (n; At), after a Taylor series expansion we have 

1 n-l r -. -. n-l „i 

W 3 (n;At) = — J2 (S- 1 )'(^)A^ + -(E- 1 )"(x l )Axf + — £ / (1 - t)^)'" {tx i+l + (1 - t)xi)dtAx\ 

i=0 J— n ^° 



j=0 



n-l 3 



= ^EEU) (S- 1 )'(xi)(-^(x i )^(^)Af + ^'(x^Atfia^AWrf- 1 * 



2At ^ ^ \k 

n-l 4 



h E E U ) (S- 1 )"(^)(-is( ; r i )F'( a;i )Af + iE'faJAiJ^xOAWi) 



4-fc 



4At Vfc 

i=0 fc=0 



n — 1 5 /r \ p \ 
5 N 



E E ( " ) / ( X - *)(£~T(^+i + (1 - <)x i )^(-^S( a ; 2 )^'(^)At + ^'(xOA^^xOA^) 5 ^ 



2At \kj J 

i=0 fc=0 v ; ^ 



291 As in Theorem 12.51 applying the ergodic lemmas of the appendix, the entropy production rate stemming 

292 from W 3 (n; At) equals to 



EP 3 {At) = lim 



W 3 (n;At) 



t^oo nAt 



fc=0 ^ ' 

+ 4^ E ^W(£-T(z)(-i S (^' 0*) A * + (x)At) k <T(x) 4 - k ]E p [AW 4 - k ] 

k=0 ^ ' 
5 

2A ^ E Epxp^i, y)(- ^(^^(^Ai + l^(x)At) k a(x) 5 - k ]E p [AW 5 



fc=0 



2At 2 



-^[(E- 1 )'(x)E 2 ( a ;)T/'(x)]At 2 + ^[(I]- 1 )'(x)E'( a ;)S(x)]At 2 + 0(Ai 3 ) 



+ [E p [(^)"(x)E 2 (x)]3At 2 + 0(At 3 )] + ^0(Ai 3 ) 



4A£ 2 
3 



1 



-^[{^)'{x)^{x)V'{x)] + ~E^(£-7(s)(£T(*)] +]E Al [(S- 1 )"(x)S i (x)] 

293 On the other hand, it holds for the invariant measure /i that 

E M [(S- 1 ) / (a;)S 2 (ar)y , (^] = E^ST 1 )"^ 2 ^)] + E M [(E _1 )'(a:)(E 2 ) / (a;)] 

294 Thus, using the error estimate (|15[) of Assumption 11.11 as in the additive case, we obtain that 



0(At) 



(44) 



(45) 



EP 3 (At) = --E,l(X-t)'(x)(^y(x)]+0(At) 
^EP 3 (At) - ^[(E-iJOrXE') 2 ^)] = O(At) 



(46) 



295 which concludes the proof of (a) . (b) is a direct consequence of (a) 



□ 
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296 2.2.1. Quadratic potential on K 

297 Let V(x) — \- be a single-well quadratic potential while the diffusion term is given by 

*«(*> = (47) 

298 The choice of the diffusion term is justified by the fact that we can control its variation in terms of x and sending 

299 e to zero, the additive noise case is recovered. The invariant measure of this process is the Gaussian measure 

300 with zero mean and variance one. This invariant measure is the simplest measure to be considered. Moreover, 

301 all the assumptions of Theorem 12 . 71 are satisfied thus we expect a O(l) behavior of the entropy production rate 

302 at least for small At. Indeed, Figure |3] shows the behavior of the numerically-computed entropy production 

303 as a function of At and it does not decrease to zero as At tends to zero. Consequently, explicit EM scheme 

304 for multiplicative noise totally destroys the reversibility property of the discrete-time approximation process 

305 independently of how small time-step is utilized. Additionally, notice that as e decreases, entropy production 

306 decreases, too. This is also expected since cr(x) — » a — const, as e — > and in combination with the quadratic 

307 potential V, EP(At) — > as e — > for any At sufficiently small. 




Figure 3. Entropy production rate as a function of time step At for multiplicative noise and 
the explicit EM scheme. As Theorem 12.71 asserts, entropy production does not decrease as At 
is decreased. This results in a permanent loss of reversibility which cannot be fixed by reducing 
the time step. Star symbols denote the theoretical value of the lower bound as it is given by 
the Theorem (i.e., c' ~ c = |E M [(E £ ) _1 (x)(T,' e ) 2 (x)]). The agreement between the theoretical 
and the numerical values is excellent. 



308 2.2.2. Milstein's scheme 

309 Since the EM scheme has entropy production rate which does not decrease as At decreases, an immediate 

310 question to ask is what happens when a higher-order scheme is applied. Milstein's scheme is the next higher- 

311 order scheme 10. IT and its explicit version is given by 

Xi+i = Xi - ^S(xi)V'(xi)At + ~g(xi)At + a{ Xi )AW t + -a( Xi )a'(xi)(AW^ - At) 
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312 which is rewritten as 



(49) 



313 Since AWi is zero-mean Gaussian random variable with variance At, the transition probability for Milstein's 
3w scheme is 

2\ 



1 



y/2nAtZ(x l , Aa 





( 1 


exp 


, 2At 



-a(xi) + y/Zfa, Ax t ) 



I TV 
2 



exp 



1 

~2At 



a(xi) + ^Z{x t ,Axi) 



(50) 



315 where 



Z(x h Axi) = E(a;i) + E'fc) ^Ax» + is(a; i )F'(x i )Ai - ^'(a^At) 



(51) 



316 Notice also that Z(xi,Axi) = (cr(xi) + |£'(xi)AWj) 2 > which is always non-negative while the transition 

317 probability density is rewritten as 

1 



U(xi,Xi + i) = 



y/2wAtZ(xi,Aa 



: exp 



2{H{x l ) + Z{x l ,Ax. l )) 
At(Vy{ Xi ) 



cosh 



y/Z(Xi)Z(Xi,AXi) 

A£E'(xi) 



(52) 



3i8 Thus, the GC action functional for Milstein's scheme equals up to boundary terms to 



W(n; At) = 



1 n—1 

5 EN 



Z(xi, Axi) 



k=0 



Z{x 



i+lj 



-Axi 



At ^ 



fc=0 



Z(xi,Axi) Z(x 



-Axi) 



(E') 2 (x,) (S') 2 (x i+1 ) 



n-l 

E 

fe=0 L 



log cosh 



y/Z(x l , Axj 

2Ata'(xi) 



log cosh 



y/Z(x i+1 , -Axi) 



2Ata'{x l+1 ) 



(53) 



319 We can test the behavior of the entropy production numerically since, as we already stated, averaged GC action 

320 functional provides under ergodicity assumption an estimate for the entropy production rate. Figure |4] shows 

321 the numerically computed entropy production for the same example shown in Figure El Evidently, entropy 

322 production rate decreases linearly as time step At is decreasing. Additionally, a number of different variance 

323 functions which satisfy the condition of Theorem 12.71 were tested and in all cases the decrease of the entropy 

324 production for the Milstein's scheme was linear. Thus, we conjecture that entropy production of overdamped 

325 Langevin process with multiplicative noise is of order O(At) for Milstein's scheme. 



326 3. Entropy Production for Langevin Process 

327 Let us consider another important class of reversible processes, namely the processes driven by the Langevin 

328 equation 

dq t = M-iptdt 

(54) 

dpt = ~VV(q t )dt - 1 {q t )M- 1 p t dt + a{q t )dB t 

329 where qt € M. dN is the position vector of the N particles, pt G M. dN is the momentum vector of the particles, M 

330 is the mass matrix, V is the potential energy, 7 is the friction factor (matrix), a is the diffusion factor (matrix) 

331 and B t is a diV-dimensional Brownian motion. Even though the Langevin system is degenerate since the noise 

332 applies only to the momenta, the process is hypoelliptic and is ergodic under mild conditions on V and a. The 
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Figure 4. Entropy production rate as a function of time step At for the explicit Milstein's 
scheme. The decrease of the entropy production rate for this numerical scheme is linear. Thus, 
in a loose sense, the reversibility property of the original continuous-time process is restored. 



333 fluctuation-dissipation theorem asserts that friction and diffusion terms are related with the inverse temperature 

334 /?(•) G K of the system by 

(<J<J T )(q t )=2(3- 1 (qth(qt)- (55) 

335 If fi{qt) — P is a constant, the Langevin equation is reversible (modulo momenta flip, see (|58jl ) with invariant 

336 measure 

fi(dq, dp) = — cxp (-/3H(q,p)) dqdp. (56) 

337 where H(q,p) is the Hamiltonian of the system given by 

H(q,p) = V(q) + ^p T M- 1 p. (57) 

338 Indeed if C denotes the generator of (|54|) , it is straightforward to verify the following modified DB condition 

< £f(q,p),g(q,p) >l^(^< f(q,-p),£g(q,-p) >l^(^ (58) 

339 for any test functions / and g which are bounded, twice differentiable with bounded derivatives. This shows 

340 that the Langevin process is reversible modulo flipping the momenta of all particles. 

341 An explicit EM-Verlet (symplectic)-implicit EM scheme is applied for the discretization of (f54"|) . It is written 

342 as 

q i+1 = q t + M-^iAt (59) 

Pi+i =Pi+± ~ W(gi+i)y -7(gt <+ i)M _1 p i+1 — - + a(q i+1 )AW i+ i 

343 with AWi, AW i+ i ~ N(0, ^rldN)- This numerical scheme also known as BBK integrator [11[T2] utilizes a 

344 Strang splitting. Its stability and convergence properties were studied in while its ergodic properties can 

345 be found in [14,15,26 . An important property of this numerical scheme which simplifies the computation of the 



Pi+k =Pi - VI%;)— - l(qi)M- 1 Pl — + a(qi)AWi 
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346 transition probabilities is that the transition probabilities are non-dcgcncratc. We rewrite the BBK integrator 

347 as 

q i+1 = q t + M- 1 ^ - W(%)-^ - jiq^M^Pi^At + M' 1 o-(q t )AtAW l (60a) 

348 

Pi+i = (I + Tfe+OM-^r^Mfe+r-^-Vn^ (60b) 

349 and thus the transition probabilities of the discrete-time approximation process are given by the product 

H(qi,pi,q i+ i,pi + i) = P{q l+1 \ qi ,p l )P{pi +1 \q i+1 ,q l ,pi) (61) 

350 where P(pi + i\qi,pi) is the propagator of the positions given by 

P{q i+ i\quPi) = ex P {^ (A® + M" 1 ^ - W(g<)^ + ^M^p^Atf 

(<7M- T M-V T )- 1 fe)(Ag i + M- 1 fe - VV( qi )^- + 1 {q l )M- 1 p~)At)} (62) 

351 where Aqi = q%+\ — qi while P(p,-_|_i|gj_|_i, qi,Pi) is the propagator of the momenta given by 

P(Pi+i\q i+ i,qi,Pi) = Zl(g 1 , t+1 ) ex P^^+ 1 - ( / + 7fe+i)M" 1 ^)" 1 (^AfA (?l - Vy( %+1 )^)) T 
(a T (J + 7 M)- T (7 + 7 M- Vr'fe+iXK+i - (1 + Tfe+iJM^yJ-^MA,, - Wfe+i)^))} (63) 

352 Finally, since the Langevin process is reversible modulo flip of the momenta, the GC action functional takes the 

353 form 

W(n; At) = y log n fa>^+i,Pm) (64) 

~^ "-{qi+i,-Pi+i,<n,-Pi) 

354 3.1. Langevin Process with Additive Noise 

355 In the following, even though the general case can be handled, we restrict for clarity to the simpler additive 

356 noise case. Thus, we assume that o~(qi) — <rl, ^{qi) = 7/ as well that particles have equal masses (M = ml). 

357 Starting as in the previous section with the GC action functional, the next lemma is stated and proved. 

358 Lemma 3.1. The GC action junctional of the BBK integrator equals to 



W(n;At)=^-J2 
mAt 

i=0 

359 

360 Proof. Firstly, (j6"2")l and (|63l) are rewritten as 



n-1 r At 2 
ApjAq l -VV(q l ) T p l — 
2m 



(65) 



and 



P(q i+l \q h Pi) = -Lexpj^— |A ft + (pi - -VV{ qi )^- + ^Lp~)At\ 2 \ (66) 
Zq I a 1 At* m 2 m 2 ' 



1 _f 1 !/■■., 7 A ^„ 7 m A„ A ^V7T ..!-> 



P{jp i+1 \q i+u q h Pi) = — cxp + ^)ftfi - (^A ft - _Wfe +1 ))| 2 j (67) 
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respectively. Then, as in the overdamped Langevin case, the computation of the GC action functional is 
straightforward , 



W(n; At) 



a 2 At 3 



E 



i=0 



At^w N A t^ 7 At N 

Aft + -^—VV( qi ) (1 - l— ) Pi 

2m m 2m 



- n—l 
VU E 



a 2 At 



i=0 
,2 



7At N to . At„ , 

1 + ^—)Pi+i - TrrAg, + —VV{ q%+1 , 
2m At 2 



At 2 nT „ , At,, 7 At , 
-Aft + — W(ft +1 + —(1 - ^— ft+i 
2m to 2to 



,„ jAt. to . At . , 



o- 2 At 3 



E 



|A«z,:| 2 + I f^wfe)! 2 + 1— (i - ^)p,\ 2 + —A q Jvv( qi ) 

2m to 2m to 



2 At,, jAt. A T At 3 ,, 7 At. 
1 - j —)AqTp,. t- 1 ~ 4— 



TO 



I Aft 



At 2 
'"2m 

2At n 7At^ A T 
~2m 



m^ ' 2m 
i2 |A< (1 7At, 
to 2to 
At 3 

T2" 



A+2 

Wfe +1 )| 2 - | — (1 - ^)p 1+ i| 2 + Aq?VV(q., 

TO 



7 At. 



— (1 - ±- )Aq[p t+1 - -Ul - -L-l)W(ft +1 ) r p i+1 



or 



2m 



n—l 

^E K 1 + 1^ W + I a7 a *I 2 + l-v^(ft + i)l 2 - (i + t^) a7^ +1 a, 



a 2 At 



i=0 



'At 



2m 7 At 



(1 



" At -)AtpJ +1 \7V( qi+1 ) - mAqfVV(q i+1 ) 



2m 
7At, 
2m 



— " r> i m a 19 i At ^xr/ m9 /, 7At,2m rp 

- K 1 + -zzr)Pi\ ~ I a^ a *I ~ I— w (ft)l + (1 + t^t)— pfA & 



,At. 

T 



2m ' At J 



+ (1 + 2^L)AtpJVV(q t ) - mAgf W(ft) 
2m 



Thus we have, 



W(n;At)= 



2 At 3 



E 



At 2 



2At 7At. A T 



Aq W ft) + VV( ft+ i)) + (1 - ^— )Aft^ A ft 

771 TO 2TO 



- ^ (1 - ^)(W(q t+1 ) T p l+1 + Wfefft) 
TO 2to 



-(1 



" At )A*(pfVT/(ft)+ft r +1 Vy(ft+i)) 



2to 
2to 

CT 2 At 2 



r 



E 

i=0 



-(1 - ^)Aq?A Pi + (1 + ^)AqjA Pi 
2m 2m 



~(1 ~ ^)(Vl/(ft+i) T ft+i + VV( qi f Pl ) - + ^)(W(?, + i)Vi + W( 9i ) T Pi) 

2to 2to 2to 2m 



2~ 



mer 2 At 



E 



At 2 

Apf A ft - — (Vy(g I+ i)Vi + W( 9l ) T Pl ) 
2to 



362 which is equal, up to boundary terms, with ([65 



□ 
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363 Remark 3.2. Proceeding as in Remark 12.31 we can compare the GC action functional of the BBK integrator 

364 to the GC functional for the additive Langevin process with constant temperature, which is given, |13j . by 



W cont (t) 



f VV(q t )p t dt « ^ V VV( qi ) T p. 



(68) 



365 and is a boundary term in continuous time. Comparing the GC functionals, it is evident that the discrete 

366 version of W con t(t) is contained in the functional W(n; At) given by (|65|) . This is similar to the overdamped 

367 Langevin case when discretized utilizing the explicit EM scheme. In addition the remaining term in the GC 

368 action functional W(n; At) stems from the Strang splitting of the numerical scheme. Moreover, this additional 

369 term critically affects the irreversibility of the discrete-time approximation process since it is the leading order 

370 term in the entropy production rate, as shown in the following theorem. 

371 Theorem 3.3. Let Assumvtion \l.l\ hold. Assume also that the potential junction V has bounded fifth- order 

372 derivative. Then, for sufficiently small At, there exists C — C(iV, 7, m) > such that 



EP(At) < CAt 



(69) 



374 Proof. Solving (160b .) for pi, changing the index from i + 1 to i in (|60b ) and adding them, the momenta equal to 

Pi 

375 Then. 



^(Ag, + Agi-x) + |(AWi_i - AW*) 



376 hence the GC action functional becomes 

n-1 



Apf A qi - VV( qi ) T Pl 



2/3 
mAt 



i=0 
n-1 r 

E 



At 2 
2m 



i=0 



(^(A*+i - Ag^O + |(-AW i+1 - AW l+ i + AW, - AW,. 



VV-(g i ^(^(Ag i + Ag i _ 1 ) + |(AWi_ i 



A£ 2 
2m 



2/3 



n-1 



mAi 



^ [~(AWi - AWi_ 1 ) T Ag 4 - W(ft) r (A ?i + Ag 4 _x)At 



i=0 
n-1 



/3<7 



i=0 
n-1 



E( vl/ (*+i)Vi + W( ft )) T A ?i 



i=0 



Agi 



(70) 
(71) 



(72) 



377 where = means equality not only up to boundary terms but also up to statistical independence which does not 

378 affect the value of the entropy production rate, either. 

379 The second sum of GC action functional has exactly the same form as in additive overdamped Langevin 

380 equation and adapting the arguments of Theorem 12.51 it can be proved that the entropy production rate for 
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381 
382 



this term is of order 0(At 2 ). The first term is treated similarly, but since an additional cancellation occurs we 
provide the details. The first sum in (|72|) equals to 



^ J2(A Wi - AW t _,) T ( (1 - ^) Pi Wfe)^ + aAWi 



in 



(3a 



i=0 
n-1 



E 



in ^ * j 



(73) 



-(1-^)(1 + ^V|A^| 2 
2m 2m 2 



383 Hence, the total entropy production rate becomes 



EP{At) 



m 2 At noo n ^ v 2m A 2ro/ 1 J ~ 2 ' 
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27 
m 2 At 

iV 7 2 



-(1 



i=0 

, 7 At 
2m 



+ 0{At 2 )) 



2m 2m 

NAt NAt 



\AWi 



0{At 2 



0{At 2 



(74) 



At + 0(At 2 



384 which completes the proof. 



□ 



385 3.1.1. Quadratic potential on a torus 

386 The conclusions of the above theorem are validated by a numerical example where the potential function is 

i 2 

387 quadratic, V(x) = 4~. Figure [5] shows the behavior of numerical entropy production rate as a function of At 

388 computed as the time-average of the GC action functional. Number of particles was set to TV = 5 while the 

389 mass of its particle was set to m = 1. The variance of the stochastic term was set a 1 = 0.01 while the final time 

390 was set to t = 2 • 10 5 . The initial data was chosen randomly from the zero- mean Gaussian distribution with 

391 appropriate variance. Notice also that due to the quadratic potential of this example Gaussian distribution 

392 is also the invariant measure of the process. Thus, the simulation is performed at the equilibrium regime. 

393 Evidently, the entropy production rate is of order O(At) as it is expected. Additionally, we plot (stars in the 

394 Figure) the leading term of the theoretical value of the entropy production rate as it given by (1741 . Apparently, 

395 the theoretical coefficient, — lr, is very close to the numerically-computed coefficient. Finally, notice that the 

396 entropy production rate is quadratically proportional to the friction factor 7 which is in accordance with (j74[) . 



397 4. Summary and Future Work 

398 In this paper, we introduce the entropy production rate as a novel tool to assess quantitatively the (lack of) 

399 reversibility of discretization schemes for various reversible SDE's. Reversibility of the discrete-time approxi- 

400 mation process is a desirable feature when equilibrium simulations are performed. The entropy production rate 

401 which is defined as the time-average of the relative entropy between the path measure of the forward process 

402 and the path measure of the time-reversed process is zero when the process is reversible and positive when it is 

403 irreversible. Thus, it provides a way to quantify the (ir)reversibility of the approximation process. Moreover, 

404 under an ergodicity assumption, entropy production rate can be computed numerically on-the-fly utilizing the 

405 GC action functional. This is another attractive feature of the entropy production rate. 

406 We have computed the entropy production rate for overdamped Langevin processes both analytically and 

407 numerically when discretized with explicit Euler-Maruyama scheme. One of the main finding in this paper is 

408 that depending on the type of the noise -additive vs multiplicative- the entropy production for the explicit EM 
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At 



Figure 5. Entropy production rate as a function of time step, At, for various friction factors 7. 
The decrease of the entropy production rate is linear as Theorem 13 . 31 asserts . Additionally, the 
theoretically-computed entropy production rate (star points) perfectly matches the numerically- 
computed entropy rate. 

409 scheme had totally different behavior. Indeed, for additive noise entropy production rate is of order 0(At 2 ) while 

410 for multiplicative noise it is of order 0(1). Hence, reversibility of the discrete-time approximation process does 

411 not depend only on the numerical scheme but also on the intrinsic characteristics of the SDE. The Milstein's 

412 scheme improved the convergence rate of the entropy production rate for multiplicative noise as shown in 

413 numerical simulations. Furthermore, we have computed the entropy production rate both analytically and 

414 numerically for discretization schemes of the Langevin process with additive noise. Specifically, we computed 

415 the entropy production rate for the BBK integrator of the Langevin equation which is an explicit EM-symplectic 

416 (Verlet)- implicit EM numerical scheme. The rate of entropy production was shown to be of order O(At). 

417 This paper offers a new conceptual tool for the evaluation of discretization schemes of SDE systems simulated 

418 at the equilibrium regime. We consider only the simplest schemes here and we will analyze in future work the 

419 behavior of the entropy production for other numerical schemes such as fully implicit EM, drift-implicit EM, 

420 higher-order schemes as well as different kind of splitting methods. Moreover, other reversible or even non- 
421 reversible processes can be analyzed in the same way, in particular extended, spatially-distributed processes. 

422 A particularly interesting example, where the reversibility of the original system is destroyed by numerical 

423 schemes in the form of spatio-temporal fractional step approximations of the generator, arises in the (partly 

424 asynchronous) parallelization of Kinetic Monte Carlo algorithms [24] . pQ. Finally, another possible extension 

425 of this work is to develop adaptive schemes based on the a posteriori simulation of entropy production rate, 

426 which should guarantee the reversibility or the approximate reversibility of the discrete-time approximation 

427 process. In this direction, the decomposition of entropy production functional for Metropolis- adjusted Langevin 

428 algorithms (MALA) [12 , 21 should be further studied and understood. 
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476 Appendix A. Tools for proving Theorem 3.2 

477 Firstly, a generalization of the trapezoidal rule is stated and proved. 

478 Lemma A.l (Generalized Trapezoidal Rule). For k odd, 

k 

V( Xi+ i)-V(xi)= Yl C a [D a V(x i+1 ) + D a V(xi)]Ax? 

|a|=l,3,... , 

+ E E Bp[Ri{x u x i+1 ) + Ri{x i+u x i )]Axt +0 

M = l,3,... |/3|=fc+2-|a| 
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479 where a — (ax, ay) is a typical d- dimensional multi-index vector, D a V(x) = g^i' g^"<j ( x ) * s ^ e a ~th partial 

480 derivative while j d . The coefficients C a are defined recursively by 

C a = - for \a\ = 1 

1 / 1 |Qh2 1 \ ( 76 ) 

481 iii/iiZe the coefficients Bp are also recursively defined by 

Bp = \ for 101=0 

1 1^1 1 ( 7? ) 
B ' 3 = -2 E ~\ B ^ f ° r Ij8| = 2,4,...,fc + 1 

| 7 |=2,4,... 7 ' 

482 Finally, the remainder terms are given by 

483 i^(^,x i+1 ) = M J 1 (l-t)H- 1 J D a +^V r ((l-i) : r i +te <+1 )*. 

484 Proof. The starting point is the usual Taylor series expansion around Xi 

fc+i 



a! 

|a|=l |a|=fc+2 



V(x i+ x)-V(xi)= J2 ±-D a V( Xi )Ax?+ V fl°(xi,x i+1 )A^ (78) 

^ — * ri I z — ' 

485 and around x^+i 

Vfc+i) - = - £ ^D Q V(z <+1 )(-A^) Q - Rl(xi+i,Xi)(-^i) a (79) 

|a|=l ' |a|=fc+2 

486 Adding the two equations we obtain the symmetrized Taylor series expansion for V given by 



k+l 



k 

V(x i+ x)-V(xi) = l ^[D a V(x i+ x) + D a V( Xi )]Axf 



2 ^ a! 1 

|o|=l,3,... 
fe+1 

~\ E ^\D a V{x l+1 )-D a V(x l )]Ax?+ 1 - J2 [Rl(xux i+ x)+R a (x i+ x, Xi )}^ 

a|=2,4,... ' |a|=fe+2 

487 Moreover, generalized trapezoidal formula (|75|) for Z? a V with \a\ even is 

fc — | CK | 

D a V(x l+1 ) - D a V( Xi ) = C- ( [D a+ '>V(x l+1 ) + D a+ ->V(xi)}Ax] 

| 7 |=1,3,... 

fc+2-|a| 

+ E E ^[i?? +/3 (^,x 1+1 ) + i?;; +/3 (x l+1 ,a ;i )]Axf +7 

|7|=1,3,... |0|=AH-2-|e»|-|7| 



(80) 



(81) 



2G 



TITLE WILL BE SET BY THE PUBLISHER 



488 Hence, substituting (j8"Tj) into ([50]) . a recursive Taylor series expansion 

V(x H .i)-V(x i ) = ~ V —AD a V{x i+1 ) + D a V{x i )]A 
2 * — ' on 

|o|=l,3,... 
, fc+1 , k-\a\ 

~2 E E C 7 [^ + ^( a ; l+1 ) + ^ + ^( a ; l )]A a; a+ ' 

H=2,4,... ' I'y |=1,3, 



fc+1 1 fc+2-|" 



I E A E E Bfl^^.iifO+^^ifi.^iAxr^ 



2 ^ a. 

a|=2,4,... |7|=1,3,... |/3|=fc+2-|a|-|7 



i ^ [Rl(x t ,x l+1 )+Rl(x l+1 ,x l )]^ 

|a|=fc+2 

J £ ^[ J D a T/(x i+1 )+ J D Q F(a ;i )]A< 



(82) 



2 , , " 

|a|=l,3,... 

2 E E 7— ^^[^^j + D^OlA^ 

|a|=3,5,... |7|=1,3,... ^ 

5 E E [^(^,^+i)+^(^+i,^)]a< 

|a|=fc+2 |/3|=fc+2-|a| 

1 fc l/3| 1 

2 E E E 

|a|=l,3,... |/3|=fc+2-|a| |7|=2,4,... 



489 is obtained after few rearrangements of the sums. Equating the same powers of (j82|) and ((75|) . the coefficients 

490 Cq. and -Bg are obtained. 

491 Up to now, we present how to compute the coefficients of the generalized trapezoidal formula. A rigorous 

492 proof of the lemma is then easily derived by induction on the order, fc, of (|75[) and proceeding on the reverse 

493 direction of the above formulas. □ 

494 Lemma A. 2. Assume that the discrete-time Markov process Xi driven by 

x i+1 =F( Xi ,AWi) (83) 

495 where AWi are i.i.d. Gaussian random variables is ergodic with invariant measure p,. Then, 

496 (i) For sufficiently smooth function h we have 



lim - Y]/i(a: i ,AWi) = E Axp [/i(a; ) i/)] (84) 

n— >oo Ji * — ' 

497 (ii) For sufficiently smooth functions f and g we have 

lim - V f( Xi )g(AWi) = Ep[f(x)]E p [g(y)] (85) 

n^oo n — ' 



2 = 
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498 (iii) For sufficiently smooth functions f and g and for bounded f holds that 

lim - V f( Xi , AWi)g(AWi) = E pxp [f(x, y)]E p [g(y)} 



(86) 



i=0 



where p is always the Gaussian measure. 



500 Proof. Proving (i) is based on showing that the transition density of the joint process = (xj,AW;) exists 

501 and it is positive. Both are trivial since the transition density is the product of the two densities which are 

502 both positive. Thus, irreducibility for the joint process is proved and in combination with stationarity the joint 

503 process is crgodic. 

504 (ii) is a direct consequence of (i) for h(x,y) = f(x)g(y). 

505 Denoting / = Ep xp [/(x, y)] and g — ¥, p [g(y)], (iii) is proved applying (i) and that 

n — 1 

-^f{ Xi ,AW^g{AW^-fg 
1 =o 

fl ]_ , , , , i 

- Y, /(*<- AW l )g{AW l ) ffri, ^)g + - £ f(x u AW l )g - fg 

n—1 n—1 

<M\-J2 9{AWi) -g\ + \g\\-J2 f(^ A Wi) - f\ 

71 l * 71 f J 



n-1 



n-1 



i=0 



(87) 



i=0 



i=0 



506 since / is bounded (i.e., |/| < M). Hence, sending n — » oo, (iii) is proved. 

507 



□ 



